Clustering from General Pairwise Observations with Applications to Time-varying Graphs
نویسندگان
چکیده
We present a general framework for graph clustering and bi-clustering where we are given a general observation (called a label) between each pair of nodes. This framework allows a rich encoding of various types of pairwise interactions between nodes. We propose a new tractable and robust approach to this problem based on convex optimization and maximum likelihood estimators. We analyze our algorithms under a general statistical model extending the planted partition and stochastic block models. Both sufficient and necessary conditions are provided for successful recovery of the underlying clusters. Our theoretical results subsume many existing graph clustering results for a wide range of settings, including planted partition, weighted clustering, submatrix localization and partially observed graphs. Furthermore, our results are applicable to novel settings including time-varying graphs, providing new insights to solutions of these problems. We provide empirical results on both synthetic and real data that corroborate with our theoretical findings.
منابع مشابه
Clustering from Labels and Time-Varying Graphs
We present a general framework for graph clustering where a label is observed to each pair of nodes. This allows a very rich encoding of various types of pairwise interactions between nodes. We propose a new tractable approach to this problem based on maximum likelihood estimator and convex optimization. We analyze our algorithm under a general generative model, and provide both necessary and s...
متن کاملGraph Clustering by Hierarchical Singular Value Decomposition with Selectable Range for Number of Clusters Members
Graphs have so many applications in real world problems. When we deal with huge volume of data, analyzing data is difficult or sometimes impossible. In big data problems, clustering data is a useful tool for data analysis. Singular value decomposition(SVD) is one of the best algorithms for clustering graph but we do not have any choice to select the number of clusters and the number of members ...
متن کاملClustering via Random Walk Hitting Time on Directed Graphs
In this paper, we present a general data clustering algorithm which is based on the asymmetric pairwise measure of Markov random walk hitting time on directed graphs. Unlike traditional graph based clustering methods, we do not explicitly calculate the pairwise similarities between points. Instead, we form a transition matrix of Markov random walk on a directed graph directly from the data. Our...
متن کاملHierarchical Clustering using Randomly Selected Similarities
The problem of hierarchical clustering items from pairwise similarities is found across various scientific disciplines, from biology to networking. Often, applications of clustering techniques are limited by the cost of obtaining similarities between pairs of items. While prior work has been developed to reconstruct clustering using a significantly reduced set of pairwise similarities via adapt...
متن کاملAssessment of the Performance of Clustering Algorithms in the Extraction of Similar Trajectories
In recent years, the tremendous and increasing growth of spatial trajectory data and the necessity of processing and extraction of useful information and meaningful patterns have led to the fact that many researchers have been attracted to the field of spatio-temporal trajectory clustering. The process and analysis of these trajectories have resulted in the extraction of useful information whic...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Journal of Machine Learning Research
دوره 18 شماره
صفحات -
تاریخ انتشار 2017